Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync from odh to rhoai 2.11 #287

Merged
merged 19 commits into from
Jun 19, 2024

Conversation

VaishnaviHire
Copy link

No description provided.

zdtsw and others added 19 commits June 18, 2024 08:13
- we might see throttling in some cluster, this is just to uplift the
default value

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 54ee87d)
…b-io#1019)

* kserve: configure servicemesh before deploying manifests

Jira: https://issues.redhat.com/browse/RHOAIENG-7312

kserve depends on odh-model-controller which it starts by deploying
manifests of Dependent Operator. The controller behaviour depends of
configuration (authorino) which is later deployed by configuring
servicemesh features. Here is a race, there are 2 checks in the
odh-model-controller for the presence of AuthorizationPolicy (which
is deployed by servicemesh configuration):

1) to add a type to the schema
2) to watch the objects of that type.

If the object appears in between odh-model-controller complains:

```
2024-05-16T06:46:03Z ERROR Reconciler error {"controller": "inferenceservice", "controllerGroup": "serving.kserve.io", "controllerKind": "InferenceService", "InferenceService": {"name":"xf","namespace":"single-model-test"}, "namespace": "single-model-test", "name": "xf", "reconcileID": "e6f42f44-1866-45d4-836a-69e4e93edef4", "error": "1 error occurred:\n\t* could not GET authconfig single-model-test/xf. cause no kind is registered for the type v1beta2.AuthConfig in scheme \"pkg/runtime/scheme.go:100\"\n\n"}   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274   sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235

```

Move servicemesh configuration before deploying the manifests to
narrow the race window.

odh-model-controller will change their check in favor of checking
Authorino CRD to avoid the race completely.

Checking AuthorizationPolicy existance in operator/kserve would fix
it as well, but it's a delay for the reconcile loop (vs
odh-model-controller where it's done only on startup), so since
odh-model-controller is going to reimplement the check, keep it as
it is.

Also modelmeshserving component can deploy
odh-model-controller (thanks Vedant for pointing) if it is
enabled. The order is unspecified by due to implementation it will
happen before kserve configuration (order of field in the Components
structure).

Signed-off-by: Yauheni Kaliuta <[email protected]>

* kserve: get rid of extra enabled check for setupKserveConfig()

To avoid linter report:

components/kserve/kserve.go:97:1: cyclomatic complexity 31 of func `(*Kserve).ReconcileComponent` is high (> 30) (gocyclo)

move setupKserveConfig() call under "if enabled" branch below.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
As the first step of improving Feature DSL this PR brings godoc explaining purpose of each method in the builder chain.

(cherry picked from commit 5ce6306)
…tahub-io#1027)

They do not belong to pkg/deploy as they are about reading/writing cluster resources rather than deploying resources.

(cherry picked from commit d90983b)
…#1033)

context should be determined by the caller and propagated
down the call chain.

(cherry picked from commit 105adae)
…#1039)

* rename / chg FeatureOwner func

* remove unused old ownerref func

(cherry picked from commit 6583645)
…datahub-io#1042)

* RHOAIENG-5426: Updated pull request template with prerequisites

* PR template changes

(cherry picked from commit 244ca13)
- Source is already used elsewhere in Feature DSL, so we might want to
  avoid confusion
- Location fits the purpose of this field better

(cherry picked from commit 8921839)
…ndatahub-io#1051)

* tests: envtest: Add OperatorCondition CRD

Add OLM's[1] OperatorCondition/OperatorConditionList to the external
CRDs. Will be used in future patches.

[1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml

Signed-off-by: Yauheni Kaliuta <[email protected]>

* tests: dscinitialization: add OperatorCondition CRD to schema

Following patches will change initialization to list the objects.

Signed-off-by: Yauheni Kaliuta <[email protected]>

* cluster: GetPlatform: replace CSV list with OperatorExists calls

Jira: https://issues.redhat.com/browse/RHOAIENG-8483

Depending of the way operators installed CSVs can be seen in all
namespace so listing of them causes producing N * M results (where N
is number of namespaces and M is number of CSVs) which is not
scalable in general and in practice causes timeouts on such large
clusters.

The function basically checks if ODH or RHOAI operator installed and
there is already such function in the package, OperatorExists(). So,
reuse it.

Signed-off-by: Yauheni Kaliuta <[email protected]>

---------

Signed-off-by: Yauheni Kaliuta <[email protected]>
(cherry picked from commit 261bbab)
…-io#1055)

- get in DSC and pass into compoment

Signed-off-by: Wen Zhou <[email protected]>
(cherry picked from commit 1b04761)
The actual use with resMap is inlined and moved to the caller.

This way we can construct plugins on demand and use them as building blocks instead.

(cherry picked from commit 7bf56a4)
Copy link

openshift-ci bot commented Jun 18, 2024

The following users are mentioned in OWNERS file(s) but are untrusted for the following reasons. One way to make the user trusted is to add them as members of the red-hat-data-services org. You can then trigger verification by writing /verify-owners in a comment.

  • devguyio
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.
  • grdryn
    • User is not a member of the org. User is not a collaborator. Satisfy at least one of these conditions to make the user trusted.

@zdtsw
Copy link

zdtsw commented Jun 18, 2024

@zdtsw
Copy link

zdtsw commented Jun 18, 2024

i added label to opendatahub-io#1064 as well.

Copy link

openshift-ci bot commented Jun 19, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: VaishnaviHire, zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [VaishnaviHire,zdtsw]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zdtsw zdtsw merged commit 7127164 into red-hat-data-services:main Jun 19, 2024
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants